Multiplex Method and Associated Functional Data Structure for Combining Digital Video Signals

ABSTRACT

Functional data structure for coding a set of digital images which are divided into macroblocks of pixels coded with colour value statements, including intraprediction macroblocks, wherein each of the images is compressed into at least one first data stream portion, which comprises a portion of the macroblocks reduced by physical redundancies, and one second data stream portion, which describes the redundancies, wherein for the intraprediction macroblocks the first data stream portion is reduced by colour value statements with correlations to colour values from rows of pixels outside and at one edge of the intraprediction macroblock and for which, in the case of pixels outside the compressed image, a colour value default is assumed, and the second data stream comprises intrapredictors for describing the correlations, with coding of an area which is divided into first areas, each of which is occupied by the macroblocks of one of the digital images, and a second area which spaces apart the first areas and is occupied by pixels with the colour value default.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is the United States National Phase under 35 U.S.C. §371 of PCT International Application No. PCT/EP2010/001253, filed on Mar. 1, 2010, and claiming priority to German Application No. 10 2009 011 251.0, filed on Mar. 2, 2009.

BACKGROUND OF THE INVENTION

1. Field of Invention

Embodiments are directed to a functional data structure and a method for generating the functional data structure. The functional data structure and the method are suitable for a multiplex method for combining digital video signals. Embodiments also relate to an encoding software, as well as data carriers and data processing devices.

2. Background of the Related Art

A digital image consists of a data set that comprises tuples containing specifications for spatial positions and color values of pixels.

The color values used are often color values from the RGB (red-green-blue) color space, especially the RGB color space with values from 0 to 255 per color channel. Alternatively, the YUV color space is often used, in which the color values are divided into a luminance signal Y and two chrominance signals U and V.
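The split into one luminance and two chrominance values can be sketched as follows. This is a minimal illustration that assumes the full-range 8-bit YCbCr form with BT.601 weights as one common digital realization of a YUV-type color space; the function name `rgb_to_ycbcr` is hypothetical and not taken from this document.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range 8-bit YCbCr (assumed BT.601 weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                    # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b         # blue-difference chrominance
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b         # red-difference chrominance

    def clamp(value):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, round(value)))

    return clamp(y), clamp(cb), clamp(cr)


# Medium gray has neutral chrominance in this representation.
gray = rgb_to_ycbcr(128, 128, 128)
```

Under these assumptions, a medium gray pixel maps to the value 128 in all three channels.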

In order to offer practical handling for processes such as storage and data transfers via a local communication network or telecommunication network, digital images are compressed.

In data processing, especially with compression algorithms, according to the current state of the art, digital images are divided into macroblocks that consist of pixels encoded with color value specifications and include implicit position specifications. The macroblocks are then used in analyzing redundancies or defining compression units of the digital image.

A compression writes data in at least

-   a first data stream portion, which comprises a portion of a data set reduced by redundancies, and
-   a second data stream portion that is assigned to the first data stream portion and describes the redundancies.

The first and second data stream portions can preferably be encoded together in one data stream. The first and second data stream portions can also be split into two data streams.

In this way, the original data from the first data stream portion can be regenerated based on the second data stream portion with reintroduction of the redundancies.
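The division into a reduced portion and a portion describing the redundancies can be illustrated with simple run-length coding. This is only a sketch under the assumption that repeated neighboring values are the redundancy being removed; it is not the compression used by any particular video standard, and the names `compress_runs` and `decompress_runs` are hypothetical.

```python
def compress_runs(samples):
    """Split samples into distinct values (first portion) and run lengths (second portion)."""
    values, runs = [], []
    for sample in samples:
        if values and values[-1] == sample:
            runs[-1] += 1          # same value as before: only lengthen the run
        else:
            values.append(sample)  # new value: store it once
            runs.append(1)
    return values, runs


def decompress_runs(values, runs):
    """Reintroduce the redundancies described by the run lengths."""
    restored = []
    for value, count in zip(values, runs):
        restored.extend([value] * count)
    return restored


row = [128, 128, 128, 128, 200, 200, 90]
values, runs = compress_runs(row)
restored = decompress_runs(values, runs)
```

Here `values` plays the role of the first data stream portion and `runs` the role of the second; together they regenerate the original row.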

Correspondingly, digital images divided into macroblocks can be compressed in

-   a first data stream portion, which comprises a portion of the macroblocks, and preferably all macroblocks, reduced by at least spatial redundancies, and
-   a second data stream portion that is assigned to the first data stream portion and describes the redundancies.

In lossy data compressions, data are removed that can be considered non-essential when reconstructed for human use. They are not regenerated during decompression. Such lossy data compressions are used in generating MP3 files, for example. Highly compressed file formats like MP3 are based on combinations of lossless and lossy data compressions.

One way to remove spatial redundancies from image data for compression is intraprediction, as described in U.S. Pat. No. 7,386,048, for example, in which examples of intrapredictions and intrapredictors are described.

In the aforementioned compressions of digital images split into macroblocks in first and second data stream portions, intraprediction macroblocks can be used for the compression, in which

-   the first data stream portion is reduced by color value specifications with correlations to color values from at least one row of pixels, which is located outside and at one edge of the intraprediction macroblock, and
-   the second data stream portion comprises intrapredictors for describing the correlations.

The correlations are, in particular, identity of the color values of pixels or similarities of the color values of pixels with those in the rows of pixels that lie outside and on one edge of the intraprediction macroblock.

The intrapredictors thereby generate instructions to assume color values of correlated pixels from this row or these rows of pixels for a pixel in the intraprediction macroblock.

If a digital image that is compressed does not itself contain any such rows of pixels, i.e., if an intraprediction macroblock has an edge in common with the edge of the digital image, then standard compression algorithms assume a row of pixels with a default color value.

This state-of-the-art aspect of image compression is illustrated in FIGS. 1 a and 1 b.

FIG. 1 a shows three same-size macroblocks within an edge 4 of a digital image: A first macroblock 1 lies in the upper left corner of the digital image. A second macroblock 2 lies between the first macroblock and a third macroblock 3. Adjacent to the third macroblock 3 is an L-shaped area 5 of pixels that are used for a DC intraprediction. This means that, for a compression with this DC intraprediction, all of the pixels in the L-shaped area 5 that lie outside of the edge 4—i.e., in FIG. 1 a lie above the third macroblock 3—are assumed to have a default color value corresponding to the medium gray value 128 from the RGB color space or the YUV color space with values of 0 to 255.
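A DC intraprediction with a default color value for pixels outside the image can be sketched as follows. The sketch follows the simplified description given here, in which every missing neighbor pixel is replaced by the default value 128; actual H.264 decoders handle unavailable neighbors with more detailed rules. The names `neighbor_pixels` and `dc_predict` and the 4×4 block size are assumptions for illustration.

```python
DEFAULT = 128  # medium gray assumed for pixels outside the image


def neighbor_pixels(image, x0, y0, size):
    """Collect the row above and the column left of the block at (x0, y0)."""
    height, width = len(image), len(image[0])

    def sample(x, y):
        inside = 0 <= x < width and 0 <= y < height
        return image[y][x] if inside else DEFAULT

    above = [sample(x0 + i, y0 - 1) for i in range(size)]
    left = [sample(x0 - 1, y0 + j) for j in range(size)]
    return above, left


def dc_predict(image, x0, y0, size):
    """Rounded mean of the neighboring pixels, used for every pixel of the block."""
    above, left = neighbor_pixels(image, x0, y0, size)
    total = sum(above) + sum(left)
    return (total + size) // (2 * size)


image = [[40] * 8 for _ in range(4)]       # 8 pixels wide, 4 pixels high
corner_dc = dc_predict(image, 0, 0, 4)     # no neighbors inside the image
second_dc = dc_predict(image, 4, 0, 4)     # left neighbors inside, upper ones outside
```

For the block in the upper left corner, all neighbors lie outside the image, so the prediction equals the default value.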

FIG. 1 b shows an arrangement of the corresponding compressed macroblock 3 in a larger scope, which is part of a digital image. In this image, above the row with the first, second, and third macroblocks 1, 2, and 3 shown in FIG. 1 a, additional macroblocks 6 and 7 are arranged in a row. When the third macroblock 3 is decompressed, with the help of the intrapredictors from the second data stream portion, pixels in this block are assigned color values of pixels in the macroblock 7 that overlaps the L-shaped area 5 shown in FIG. 1 b. In these assignments, the medium gray value assumed for a pixel during compression is replaced by a different color value from a pixel in the overlapping macroblock 7. This causes decompression errors.
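The mismatch between compression and decompression can be reproduced with a small numerical sketch. It assumes a strongly simplified codec in which a block is stored as the difference between its pixels and a DC value taken only from the row above; the block size, pixel values, and function names are hypothetical.

```python
DEFAULT = 128
SIZE = 4


def dc_from_row_above(canvas, x0, y0):
    """Mean of the SIZE pixels directly above the block; DEFAULT at the top edge."""
    if y0 == 0:
        return DEFAULT
    row = canvas[y0 - 1][x0:x0 + SIZE]
    return (sum(row) + SIZE // 2) // SIZE


def encode_block(canvas, x0, y0):
    """Store each pixel as its difference from the prediction."""
    pred = dc_from_row_above(canvas, x0, y0)
    return [[canvas[y0 + j][x0 + i] - pred for i in range(SIZE)] for j in range(SIZE)]


def decode_block(residual, canvas, x0, y0):
    """Add the prediction found at decoding time back to the stored differences."""
    pred = dc_from_row_above(canvas, x0, y0)
    return [[value + pred for value in row] for row in residual]


# Sender: the block lies at the upper edge of its own image, so 128 is assumed.
image_a = [[60] * SIZE for _ in range(SIZE)]
residual = encode_block(image_a, 0, 0)

# Compilation without re-encoding: another image is placed directly above.
image_b = [[200] * SIZE for _ in range(SIZE)]
stacked = image_b + image_a

correct = decode_block(residual, image_a, 0, 0)     # decoded in isolation
faulty = decode_block(residual, stacked, 0, SIZE)   # decoded inside the compilation
```

Decoded in isolation, the block returns to the value 60; decoded inside the compilation, the prediction becomes 200 instead of 128 and every pixel is reconstructed as 132.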

If pixels from the macroblock 3 decompressed with errors are used for further intrapredictions in compressing or decompressing the digital image, the error can be compounded over large portions of the digital image.

If pixels from the macroblock 3 decompressed with errors are used for intrapredictions in timed image sequences such as videos, the error can be compounded over a long running period of the video.

In video image sequences, macroblocks of the images, which can be luminance areas of 16×16 pixels, exist not only in a spatial but also in a time-related context.

For software applications in computer systems as well as for video conferencing applications, image recordings of multiple timed sequences, i.e., time values of assigned images, are included and the timed sequences are synchronized with each other. In this way, in video conferences, each video conference participant or each group of video conference participants records videos that comprise image recordings of a timed sequence and usually related sound recordings, plus other recordings if applicable. These recordings are synchronized in time, so that the video conference participants or groups of video conference participants can play the recordings of some or all video conference participants or groups of video conference participants simultaneously.

The recordings of video conference participants or groups of video conference participants are generally sent by video conference applications via data networks to a central server, such as a multipoint control unit (MCU), where they are compiled and sent back via data networks to the video conference participants or groups of video conference participants. To do this, recording data must be compressed, sent to a central server, compiled there, sent from there in compressed form to the participants or groups of participants, and then can be decompressed and decoded.

For transmitting and sending data via data networks, the data volumes transmitted or sent determine the requirements for hardware and network resources, and these are limiting factors for video conference applications. Compression of recorded data is therefore essential.

When synchronous video image sequences from different video conference participants are compiled in compressed form in a multipoint control unit and sent to all video conference participants, the problems described above with reference to FIGS. 1 a and 1 b appear. A compressed macroblock 3 for an image from a received video image sequence can be placed adjacent to a macroblock 7 of a synchronous image from another video image sequence, and then after reception and decoding or decompression of the compiled video image sequences at the video conference participant's end, errors occur with the resulting poor image quality.

According to the current state of the art, compressed recording data are received by a multipoint control unit, where they are fully decompressed, compiled, then recompressed and finally sent out. Such methods have high hardware resource requirements and can also involve unacceptable transmission delays. There are known encoding standards for compressing video data, such as H.264/MPEG-4 AVC in particular. In video standard H.264/AVC, the following intrapredictions for luma samples are defined:

-   8 directional intraprediction modes plus one DC prediction mode for blocks with 4×4 pixels,
-   8 directional intraprediction modes plus one DC prediction mode for blocks with 8×8 pixels,
-   3 directional intraprediction modes plus one DC prediction mode for blocks with 16×16 pixels.

The directional intraprediction modes use an area 5, as illustrated in FIGS. 1 a and 1 b, with a row of pixels from blocks 7, 2, which are adjacent to intraprediction block 3, above or to the left of the related intraprediction block 3. These directional intraprediction modes instruct at least some of these pixels to be copied to intraprediction block positions that lie in a predetermined direction from the copied pixel. These directions run downward, to the right, or in varying diagonals from the sides of the L-shaped area 5 shown in FIGS. 1 a and 1 b to intraprediction block 3. For the DC prediction modes, an average color value for pixels in an area 5 shown in FIGS. 1 a and 1 b is used to predict the color value of all pixels in the intraprediction block. Especially in these DC prediction modes, the decompression errors mentioned above with reference to FIGS. 1 a and 1 b have a strong detrimental effect on image quality.
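Two of the directional modes can be sketched as simple copy operations. This is a minimal illustration of copying downward from the row above and to the right from the column on the left; the diagonal modes and the exact rules of the standard are omitted, and the function names are hypothetical.

```python
def predict_vertical(above, size):
    """Copy each pixel of the row above downward through its column."""
    return [list(above[:size]) for _ in range(size)]


def predict_horizontal(left, size):
    """Copy each pixel of the column on the left rightward through its row."""
    return [[left[j]] * size for j in range(size)]


above = [10, 20, 30, 40]   # row of pixels above the block
left = [50, 60, 70, 80]    # column of pixels to the left of the block

vertical = predict_vertical(above, 4)
horizontal = predict_horizontal(left, 4)
```

In both modes the predicted block depends entirely on the neighboring pixels, so any change to those pixels between compression and decompression propagates into the whole block.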

Newer transfer formats corresponding to H.264/AVC for video data allow synchronous video image sequences from different video conference participants to be compiled as groups of compressed macroblocks. These formats encode information that prohibits intrapredictions, i.e., intraprediction modes for macroblocks in a video image sequence for a video conference participant, that refer to pixels outside of the edges of the images in this video image sequence. However, these transfer formats are not universally applicable for decoders that do not process this information. With the current state of the art, video conference participants who do not use decoders that decode information prohibiting certain intrapredictions cannot receive many video conferences. Supplying numerous video conference participants with suitable new decoders is an inefficient solution to this problem, especially in view of the costs involved.

BRIEF SUMMARY OF THE INVENTION

It would be desirable to send compressed recordings of video data efficiently, especially with respect to the hardware and/or software resources required. Preferably, compilations of such recordings could be handled easily by a central server without first decompressing them and would be decodable by many decoder types.

Embodiments may address this goal with a method and a functional data structure for encoding a set of at least two digital images that are divided into macroblocks of pixels encoded with color value specifications, including intraprediction macroblocks, whereby each image is compressed in at least

-   a first data stream portion, which comprises a portion of the macroblocks reduced by at least spatial redundancies, and
-   a second data stream portion that is assigned to the first data stream portion and describes the redundancies,

wherein, for each of the intraprediction macroblocks,

-   the first data stream portion is reduced by color value specifications with correlations to color values from at least one row of pixels, which is outside of and assigned to one edge of the intraprediction macroblock and for which, in the case of pixels outside of the compressed image, a default color value is assumed, and
-   the second data stream portion comprises intrapredictors for describing the correlations.

The color values used can be luminance and/or chrominance values.

The macroblocks can consist of square spaces of the same size, each with the same number of pixels.

The first data stream portion can be reduced by color value specifications with correlations to color values from at least one row of pixels, which is located outside and at one edge of the intraprediction macroblock. In particular, the row of pixels can be located at an upper or left edge of the macroblock.

The invention includes the encoding of one space, which is divided into first spaces, each of which is occupied by the macroblocks for one of the digital images, and a second space that separates the first spaces from each other and is occupied by pixels with the default color value.

This encoding according to the invention prevents decompression errors in intrapredictions. Because the second space separates the first spaces from each other and consists of pixels with the default color value, the default color value assumed during compression for macroblocks located at the edge of an image, as described above with reference to FIG. 1 a, is also guaranteed when the compiled images are decoded.
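The arrangement of first spaces and a separating second space can be sketched on pixel arrays. The sketch assumes equally sized images of one 16×16 macroblock each and works on uncompressed pixel values purely for illustration, whereas the encoding described here arranges already compressed macroblocks; the function name `compose_with_separator` is hypothetical.

```python
DEFAULT = 128  # default color value filling the second space
MB = 16        # assumed macroblock edge length in pixels


def compose_with_separator(images, columns):
    """Tile equally sized images into a grid separated by MB-wide bands of DEFAULT."""
    height, width = len(images[0]), len(images[0][0])
    rows = -(-len(images) // columns)                  # ceiling division
    canvas_h = rows * height + (rows - 1) * MB
    canvas_w = columns * width + (columns - 1) * MB
    canvas = [[DEFAULT] * canvas_w for _ in range(canvas_h)]
    origins = []
    for index, image in enumerate(images):
        grid_y, grid_x = divmod(index, columns)
        top, left = grid_y * (height + MB), grid_x * (width + MB)
        for y in range(height):
            canvas[top + y][left:left + width] = image[y]   # first space
        origins.append((left, top))
    return canvas, origins


images = [[[value] * MB for _ in range(MB)] for value in (10, 20, 30, 40)]
canvas, origins = compose_with_separator(images, columns=2)

# Row directly above the third image and column directly left of the second image.
row_above_third = canvas[origins[2][1] - 1][0:MB]
column_left_of_second = [canvas[y][origins[1][0] - 1] for y in range(MB)]
```

In this layout, every row or column of pixels bordering an image from outside carries the default value, which is the same value that was assumed when that image was compressed on its own.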

The macroblocks preferably consist of square spaces of the same size, each with the same number of pixels, and the second space separates every two first spaces by a parallel strip whose width corresponds to one of the square macroblocks. In this way data are prepared that are especially well suited to standard decoding with decompression in blocks.

The L-shaped area 5 illustrated in FIG. 1 a, with pixels used as predictions for the intraprediction block 3, extends beyond this intraprediction block 3 on the right in FIG. 1 a. When the digital images are compressed, the default color value can be assumed for pixels in the area that extends beyond the intraprediction block 3 and can be used for an intraprediction. To prevent decompression errors when decoding a set of images for which the overextending area extends into an image in that set, the second space preferably also lies along an edge of the space that is divided into the second and first spaces.

In particular, if the invention is applied to video conference systems, the digital images are time-synchronized images with differently timed image sequences, especially data sets of I-frames from video image recordings.

The first data stream portion is then advantageously reduced by space and time redundancies, wherein in particular at least one method is used that is selected from compression based on a frequency analysis, especially by discrete cosine transformation, compression based on quantization, and compression based on entropy encoding.
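A frequency analysis followed by a quantization step can be sketched in one dimension. The sketch assumes an orthonormal floating-point discrete cosine transform and a uniform step size of 16; video standards such as H.264/AVC use their own transforms and quantization rules, and entropy encoding is omitted. All function names are hypothetical.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a list of samples."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(s * math.cos(math.pi * (i + 0.5) * k / n) for i, s in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs):
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        samples.append(total)
    return samples


def quantize(coeffs, step):
    """Lossy step: keep only the nearest multiple of the step size."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    return [level * step for level in levels]


flat = [128, 128, 128, 128]
levels = quantize(dct(flat), 16)
restored = idct(dequantize(levels, 16))
```

For a uniform row of pixels, only the first coefficient is non-zero, so the quantized levels contain a single value followed by zeros, which subsequent coding can represent compactly.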

The method and data structures described above can be used in multiplex methods to compile digital video signals and can be implemented as encoding software, which preferably comprises

-   a unit for receiving the compressed images via a telecommunication network and
-   a unit for sending data encoded by the encoding software via a telecommunication network.

Software or data structures according to the invention are stored on data storage media.

A data processing system with such a data storage medium, which is equipped with encoding software per the invention, is also one aspect of the invention.

BRIEF DESCRIPTION OF THE FIGURES

Aspects and an exemplary embodiment of the invention are described below with reference to the figures, in which:

FIGS. 1 a and 1 b show schematic illustrations of macroblocks in different parts of a digital image and FIG. 2 shows a schematic illustration of a data structure according to the invention.

FIGS. 1 a and 1 b illustrate the design of a functional data structure for encoding a set of two digital images in a conventional multipoint control unit. The digital images are divided into macroblocks 1, 2, 3, 6, and 7 of pixels encoded with position and color value specifications, including intraprediction macroblock 3, wherein each image is compressed. One of the images with macroblocks 1, 2, 3 at its edge 4 is received by the multipoint control unit and compiled with a data set with macroblocks from another digital image having macroblocks 6 and 7 in such a way that the macroblocks 7 contain pixels to which intrapredictors for the intraprediction macroblock 3 refer. Such a compilation is avoided in the functional data structure for encoding a set of four digital images according to the invention, shown in FIG. 2.

LIST OF REFERENCE NUMBERS

-   1, 2, 6, 7 Macroblocks
-   3 Intraprediction macroblock
-   4 Edge
-   5 Pixel area for intrapredictions
-   8 First space
-   9 Second space

DETAILED DESCRIPTION OF THE INVENTION

This functional data structure according to embodiments of the invention is created in a multipoint control unit according to the invention with encoding for a space that is divided into first spaces 8, each of which is occupied by macroblocks 1 and 3 for each of the four digital images, and a second space 9 that separates the first spaces 8 from each other. The macroblocks 1 and 3 are compressed and comprise intraprediction macroblocks 3, which are reduced by color value specifications with correlations to color values from at least one row of pixels, which is outside of and assigned to one edge of the intraprediction macroblock and for which, in the case of pixels outside of the compressed image, a default color value is assumed. All pixels in the second space 9 have this default color value.

The multipoint control unit according to the invention receives four digital images, each of which is divided into macroblocks of pixels encoded with position and color value specifications, including intraprediction macroblocks, wherein each image is compressed according to the H.264/AVC standard.

The macroblocks 1 and 3 have square spaces of the same size, each with the same number of pixels, and the second space 9 separates every two of the first spaces 8 parallel at a distance corresponding to one of the square macroblocks 1 and 3. In the first spaces 8, time-synchronized images from different data sets of frames of video image recordings from four video conference participants are arranged. The video image recordings are reduced by space and time redundancies corresponding to a combination of compressions according to the H.264/AVC standard.
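The placement of macroblocks in such an arrangement can be sketched as an address mapping in macroblock units. The sketch assumes that macroblocks are numbered in raster-scan order across the compiled space and that the separating second space is exactly one macroblock wide; the function name and parameters are hypothetical, and the rewriting of the compressed data stream itself is not shown.

```python
def composite_mb_address(image_index, mb_x, mb_y, mbs_wide, mbs_high, columns):
    """Map macroblock (mb_x, mb_y) of a source image to its address in the compiled space."""
    grid_y, grid_x = divmod(image_index, columns)
    comp_x = grid_x * (mbs_wide + 1) + mb_x      # +1 skips the separating column
    comp_y = grid_y * (mbs_high + 1) + mb_y      # +1 skips the separating row
    comp_width = columns * mbs_wide + (columns - 1)
    return comp_y * comp_width + comp_x


# Four images of 2x2 macroblocks each, arranged in two columns.
first = composite_mb_address(0, 0, 0, 2, 2, 2)
second = composite_mb_address(1, 0, 0, 2, 2, 2)
fourth = composite_mb_address(3, 0, 0, 2, 2, 2)
```

With these assumed dimensions the compiled space is five macroblocks wide, and the addresses in between those of the first spaces belong to the second space.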

A timed sequence of sets of four compressed images as shown in FIG. 2, corresponding to the time-synchronized sequences of the video image recordings received from the four video conference participants, comprises data sets with the data structure illustrated in FIG. 2. This timed sequence of sets of compressed images is sent by the multipoint control unit. For receiving and sending, the multipoint control unit has

-   a unit for receiving the compressed images via a telecommunication network and
-   a unit for sending data encoded by the encoding software via a telecommunication network.

The data sent can be received and displayed with high quality by a large number of decoders.

Although the figures refer to a video conference application, the invention is generally applicable to any applications that involve the preparation of sets of compressed image data, at least part of which can be compressed by intraprediction. Such applications are especially interesting for web service images offered on the Internet. 

1. A method for encoding a set of at least two digital images that are divided into macroblocks of pixels encoded with color value specifications, including intraprediction macroblocks, comprising compressing each image in at least a first data stream portion, comprising a portion of the macroblocks reduced by at least spatial redundancies, and a second data stream portion that is assigned to the first data stream portion and describes the redundancies, wherein, for each of the intraprediction macroblocks, the first data stream portion is reduced by color value specifications with correlations to color values from at least one row of pixels, which is outside of and assigned to one edge of the intraprediction macroblock and for which, in the case of pixels outside of the compressed image, a default color value is assumed, and the second data stream comprises intrapredictors to describe the correlations, distinguished by the encoding of one space, which is divided into first spaces, each of which is occupied by the macroblocks for one of the digital images, and a second space that separates the first spaces from each other and is occupied by pixels with the default color value.
 2. The method of claim 1, wherein the macroblocks have square spaces of the same size, each with the same number of pixels.
 3. The method of claim 2, wherein the second space separates every two of the first spaces parallel at a distance corresponding to one of the square macroblocks.
 4. The method of claim 1, comprising locating the second space on one edge of the space divided into the second and first spaces.
 5. The method of claim 1, comprising reducing the first data stream by color value specifications with correlations to color values from at least one row of pixels, which is outside of and assigned to one edge of the intraprediction macroblock.
 6. The method of claim 5, comprising assigning the row of pixels to an upper or left edge.
 7. The method of claim 1, wherein the digital images are time-synchronized images with differently timed image sequences.
 8. The method of claim 7, comprising reducing the first data stream by space and time redundancies.
 9. The method of claim 8, comprising reducing the first data stream by at least one method which is selected from the group consisting of compression based on a frequency analysis, compression based on quantization, and compression based on entropy coding.
 10. A multiplex method for compiling digital video signals, comprising implementing the method of claim 7 and compiling digital video signals.
 11. Encoding software comprising a unit with program steps that, when performed, carry out the method of claim 1.
 12. The encoding software of claim 11, comprising: a unit for receiving compressed images via a telecommunication network and a unit for sending data encoded by the encoding software via the telecommunication network.
 13. A functional data structure for encoding a set of at least two digital images that are divided into macroblocks of pixels encoded with position and color value specifications, including intraprediction macroblocks, wherein each image is compressed in at least a first data stream portion, which includes a portion of the macroblocks reduced by at least spatial redundancies, and a second data stream portion that is assigned to the first data stream portion and describes the redundancies, wherein, for each of the intraprediction macroblocks, the first data stream portion is reduced by color value specifications with correlations to color values from at least one row of pixels, which is located outside and at one edge of the intraprediction macroblock and for which, in the case of pixels outside of the compressed image, a default color value is assumed, and the second data stream comprises intrapredictors to describe the correlations, distinguished by the encoding of one space, which is divided into first spaces, each of which is occupied by the macroblocks for one of the digital images, and a second space that separates the first spaces from each other and is occupied by pixels with the default color value.
 14. A data storage medium storing the encoding software of claim 11.
 15. A processing system comprising the encoding software of claim 11, wherein said encoding software is present on a storage medium.